RAG Evaluation Toolkit
GENERATOR
62.0%
The Generator is the LLM inside the RAG system that generates the answers.
RETRIEVER
50.0%
The Retriever fetches relevant documents from the knowledge base according to a user query.
REWRITER
36.67%
The Rewriter modifies the user query to match a predefined format or to include the context from the chat history.
ROUTING
100.0%
The Router filters the user's query based on its intent (intent detection).
KNOWLEDGE_BASE
0.0%
The knowledge base is the set of documents given to the RAG system to generate the answers. Its score is computed differently from the other components: it is the difference between the maximum and minimum correctness scores across all the topics of the knowledge base.
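The knowledge base score described above can be sketched as follows. This is a minimal illustration of the stated rule (maximum minus minimum per-topic correctness), not the toolkit's actual implementation. The function name `knowledge_base_score`, the topic labels, and the per-topic values are hypothetical and are not taken from this report.

```python
def knowledge_base_score(correctness_by_topic: dict[str, float]) -> float:
    """Return the spread between the best and worst per-topic correctness.

    Follows the description in this report: max correctness minus
    min correctness across all topics. Scores are fractions in [0, 1].
    """
    if not correctness_by_topic:
        raise ValueError("at least one topic is required")
    scores = list(correctness_by_topic.values())
    return max(scores) - min(scores)


# Hypothetical per-topic correctness values, for illustration only.
example_topics = {"topic_a": 0.75, "topic_b": 0.5, "topic_c": 0.25}
spread = knowledge_base_score(example_topics)
print(f"Knowledge base score: {spread:.1%}")
```

Under this definition, a knowledge base whose topics all share the same correctness yields a spread of zero, whatever that shared correctness is.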
Overall Correctness Score
53%
RECOMMENDATION
Focus on improving the RAG system's performance on conversational and distracting-element questions, which currently have the lowest scores. Check whether the retriever and the rewriter handle these questions adequately and efficiently. To improve the knowledge base, review low-scoring topics such as Corporate Financial Analysis, Electric Vehicle Development, and Equity Securities Valuation.
CORRECTNESS BY TOPIC
KNOWLEDGE BASE OVERVIEW
SELECTED METRICS